AITopics | design quality

Collaborating Authors

design quality

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Towards a Comprehensive Benchmark for High-Level Synthesis Targeted to FPGAs

Neural Information Processing SystemsDec-26-2025, 07:57:39 GMT

comprehensive benchmark, high-level synthesis targeted, name change, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.36)

Add feedback

BikeBench: A Bicycle Design Benchmark for Generative Models with Objectives and Constraints

Regenwetter, Lyle, Obaideh, Yazan Abu, Chiotti, Fabien, Lykourentzou, Ioanna, Ahmed, Faez

arXiv.org Artificial IntelligenceOct-28-2025

We introduce BikeBench, an engineering design benchmark for evaluating generative models on problems with multiple real-world objectives and constraints. As generative AI's reach continues to grow, evaluating its capability to understand physical laws, human guidelines, and hard constraints grows increasingly important. Engineering product design lies at the intersection of these difficult tasks, providing new challenges for AI capabilities. BikeBench evaluates AI models' capabilities to generate bicycle designs that not only resemble the dataset, but meet specific performance objectives and constraints. To do so, BikeBench quantifies a variety of human-centered and multiphysics performance characteristics, such as aerodynamics, ergonomics, structural mechanics, human-rated usability, and similarity to subjective text or image prompts. Supporting the benchmark are several datasets of simulation results, a dataset of 10,000 human-rated bicycle assessments, and a synthetically generated dataset of 1.6M designs, each with a parametric, CAD/XML, SVG, and PNG representation. BikeBench is uniquely configured to evaluate tabular generative models, large language models (LLMs), design optimization, and hybrid algorithms side-by-side. Our experiments indicate that LLMs and tabular generative models fall short of hybrid GenAI+optimization algorithms in design quality, constraint satisfaction, and similarity scores, suggesting significant room for improvement. We hope that BikeBench, a first-of-its-kind benchmark, will help catalyze progress in generative AI for constrained multi-objective engineering design problems. We provide code, data, an interactive leaderboard, and other resources at https://github.com/Lyleregenwetter/BikeBench.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2508.0083

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Automobiles & Trucks (0.69)
Health & Medicine (0.46)
Leisure & Entertainment > Sports > Cycling (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.69)

Add feedback

UI-Bench: A Benchmark for Evaluating Design Capabilities of AI Text-to-App Tools

Jung, Sam, Garcinuno, Agustin, Mateega, Spencer

arXiv.org Artificial IntelligenceSep-5-2025

AI text-to-app tools promise high quality applications and websites in minutes, yet no public benchmark rigorously verifies those claims. We introduce UI-Bench, the first large-scale benchmark that evaluates visual excellence across competing AI text-to-app tools through expert pairwise comparison. Spanning 10 tools, 30 prompts, 300 generated sites, and 4,000+ expert judgments, UI-Bench ranks systems with a TrueSkill-derived model that yields calibrated confidence intervals. UI-Bench establishes a reproducible standard for advancing AI-driven web design. We release (i) the complete prompt set, (ii) an open-source evaluation framework, and (iii) a public leaderboard. The generated sites rated by participants will be released soon. View the UI-Bench leaderboard at https://uibench.ai/leaderboard.

benchmark, large language model, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2508.2041

Genre: Research Report > New Finding (0.68)

Industry:

Information Technology (0.47)
Health & Medicine (0.46)
Banking & Finance (0.46)

Technology:

Information Technology > Communications > Web (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Towards a Comprehensive Benchmark for High-Level Synthesis Targeted to FPGAs

Neural Information Processing SystemsJan-19-2025, 14:30:54 GMT

High-level synthesis (HLS) aims to raise the abstraction layer in hardware design, enabling the design of domain-specific accelerators (DSAs) like field-programmable gate arrays (FPGAs) using C/C instead of hardware description languages (HDLs). Compiler directives in the form of pragmas play a crucial role in modifying the microarchitecture within the HLS framework. However, the space of possible microarchitectures grows exponentially with the number of pragmas. To accelerate this process, machine learning models have been used to predict design quality in milliseconds. However, existing open-source datasets for training such models are limited in terms of design complexity and available optimizations. It contains more complex programs with a wider range of optimization pragmas, making it a comprehensive dataset for training and evaluating design quality prediction models.

comprehensive benchmark, fpga, high-level synthesis targeted, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.79)

Add feedback

What makes a good BIM design: quantitative linking between design behavior and quality

Ni, Xiang-Rui, Pan, Peng, Lin, Jia-Rui

arXiv.org Artificial IntelligenceNov-14-2024

In the Architecture Engineering & Construction (AEC) industry, how design behaviors impact design quality remains unclear. This study proposes a novel approach, which, for the first time, identifies and quantitatively describes the relationship between design behaviors and quality of design based on Building Information Modeling (BIM). Real-time collection and log mining are integrated to collect raw data of design behaviors. Feature engineering and various machine learning models are then utilized for quantitative modeling and interpretation. Results confirm an existing quantifiable relationship which can be learned by various models. The best-performing model using Extremely Random Trees achieved an R2 value of 0.88 on the test set. Behavioral features related to designer's skill level and changes of design intentions are identified to have significant impacts on design quality. These findings deepen our understanding of the design process and help forming BIM designs with better quality.

design behavior, design quality, designer, (15 more...)

arXiv.org Artificial Intelligence

2411.09481

Country:

Asia > China > Beijing > Beijing (0.04)
Europe > Portugal > Faro > Faro (0.04)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Construction & Engineering (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Add feedback

UIClip: A Data-driven Model for Assessing User Interface Design

Wu, Jason, Peng, Yi-Hao, Li, Amanda, Swearngin, Amanda, Bigham, Jeffrey P., Nichols, Jeffrey

arXiv.org Artificial IntelligenceApr-18-2024

User interface (UI) design is a difficult yet important task for ensuring the usability, accessibility, and aesthetic qualities of applications. In our paper, we develop a machine-learned model, UIClip, for assessing the design quality and visual relevance of a UI given its screenshot and natural language description. To train UIClip, we used a combination of automated crawling, synthetic augmentation, and human ratings to construct a large-scale dataset of UIs, collated by description and ranked by design quality. Through training on the dataset, UIClip implicitly learns properties of good and bad designs by i) assigning a numerical score that represents a UI design's relevance and quality and ii) providing design suggestions. In an evaluation that compared the outputs of UIClip and other baselines to UIs rated by 12 human designers, we found that UIClip achieved the highest agreement with ground-truth rankings. Finally, we present three example applications that demonstrate how UIClip can facilitate downstream applications that rely on instantaneous assessment of UI design quality: i) UI code generation, ii) UI design tips generation, and iii) quality-aware UI example search.

dataset, screenshot, uiclip, (16 more...)

arXiv.org Artificial Intelligence

2404.125

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

RTLLM: An Open-Source Benchmark for Design RTL Generation with Large Language Model

Lu, Yao, Liu, Shang, Zhang, Qijun, Xie, Zhiyao

arXiv.org Artificial IntelligenceNov-11-2023

Inspired by the recent success of large language models (LLMs) like ChatGPT, researchers start to explore the adoption of LLMs for agile hardware design, such as generating design RTL based on natural-language instructions. However, in existing works, their target designs are all relatively simple and in a small scale, and proposed by the authors themselves, making a fair comparison among different LLM solutions challenging. In addition, many prior works only focus on the design correctness, without evaluating the design qualities of generated design RTL. In this work, we propose an open-source benchmark named RTLLM, for generating design RTL with natural language instructions. To systematically evaluate the auto-generated design RTL, we summarized three progressive goals, named syntax goal, functionality goal, and design quality goal. This benchmark can automatically provide a quantitative evaluation of any given LLM-based solution. Furthermore, we propose an easy-to-use yet surprisingly effective prompt engineering technique named self-planning, which proves to significantly boost the performance of GPT-3.5 in our proposed benchmark.

benchmark, functionality, gpt-3, (12 more...)

arXiv.org Artificial Intelligence

2308.05345

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)

Add feedback

Automatic Assessment of the Design Quality of Python Programs with Personalized Feedback

Orr, J. Walker, Russell, Nathaniel

arXiv.org Artificial IntelligenceJun-2-2021

The assessment of program functionality can generally be accomplished with straight-forward unit tests. However, assessing the design quality of a program is a much more difficult and nuanced problem. Design quality is an important consideration since it affects the readability and maintainability of programs. Assessing design quality and giving personalized feedback is very time consuming task for instructors and teaching assistants. This limits the scale of giving personalized feedback to small class settings. Further, design quality is nuanced and is difficult to concisely express as a set of rules. For these reasons, we propose a neural network model to both automatically assess the design of a program and provide personalized feedback to guide students on how to make corrections. The model's effectiveness is evaluated on a corpus of student programs written in Python. The model has an accuracy rate from 83.67% to 94.27%, depending on the dataset, when predicting design scores as compared to historical instructor assessment. Finally, we present a study where students tried to improve the design of their programs based on the personalized feedback produced by the model. Students who participated in the study improved their program design scores by 19.58%.

assignment, neural network, student program, (15 more...)

arXiv.org Artificial Intelligence

2106.01399

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)

Genre: Research Report > Experimental Study (0.68)

Industry:

Education > Educational Setting (0.66)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Quantitatively Assessing the Benefits of Model-driven Development in Agent-based Modeling and Simulation

Santos, Fernando, Nunes, Ingrid, Bazzan, Ana L. C.

arXiv.org Artificial IntelligenceJun-15-2020

The agent-based modeling and simulation (ABMS) paradigm has been used to analyze, reproduce, and predict phenomena related to many application areas. Although there are many agent-based platforms that support simulation development, they rely on programming languages that require extensive programming knowledge. Model-driven development (MDD) has been explored to facilitate simulation modeling, by means of high-level modeling languages that provide reusable building blocks that hide computational complexity, and code generation. However, there is still limited knowledge of how MDD approaches to ABMS contribute to increasing development productivity and quality. We thus in this paper present an empirical study that quantitatively compares the use of MDD and ABMS platforms mainly in terms of effort and developer mistakes. Our evaluation was performed using MDD4ABMS-an MDD approach with a core and extensions to two application areas, one of which developed for this study-and NetLogo, a widely used platform. The obtained results show that MDD4ABMS requires less effort to develop simulations with similar (sometimes better) design quality than NetLogo, giving evidence of the benefits that MDD can provide to ABMS.

artificial intelligence, participant, simulation, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1016/j.simpat.2020.102126

2006.0882

Country:

South America > Brazil > São Paulo (0.04)
South America > Brazil > Santa Catarina (0.04)
South America > Brazil > Rio Grande do Sul > Porto Alegre (0.04)
(6 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)

Industry:

Information Technology (0.68)
Health & Medicine > Therapeutic Area (0.46)
Health & Medicine > Epidemiology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback